
    He Said, She Said: Style Transfer for Shifting the Perspective of Dialogues

    In this work, we define a new style transfer task: perspective shift, which reframes a dialogue from an informal first-person exchange into a formal third-person rephrasing of the text. This task requires challenging coreference resolution, emotion attribution, and interpretation of informal text. We explore several baseline approaches and discuss further directions for this task when applied to short dialogues. As a sample application, we demonstrate that applying perspective shifting to a dialogue summarization dataset (SAMSum) substantially improves the zero-shot performance of extractive news summarization models on this data. Additionally, supervised extractive models perform better when trained on perspective-shifted data than on the original dialogues. We release our code publicly. Comment: Findings of EMNLP 2022, 18 pages.

    Unlimiformer: Long-Range Transformers with Unlimited Length Input

    Since the proposal of transformers, these models have been limited to bounded input lengths because of their need to attend to every token in the input. In this work, we propose Unlimiformer: a general approach that wraps any existing pretrained encoder-decoder transformer and offloads the cross-attention computation to a single k-nearest-neighbor (kNN) index, where the returned kNN distances are the attention dot-product scores. This kNN index can be kept in either GPU or CPU memory and queried in sub-linear time; this way, we can index practically unlimited input sequences, while every attention head in every decoder layer retrieves its top-k keys instead of attending to every key. We evaluate Unlimiformer on several long-document and book-summarization benchmarks, showing that it can process even 500k-token-long inputs from the BookSum dataset without any input truncation at test time. We demonstrate that Unlimiformer improves pretrained models such as BART and Longformer by extending them to unlimited inputs without additional learned weights and without modifying their code. We make our code and models publicly available at https://github.com/abertsch72/unlimiformer. Comment: NeurIPS 2023.
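    The retrieval idea in this abstract can be sketched in a few lines of numpy. This is a toy illustration, not the authors' implementation: exhaustive dot-product search stands in for the kNN index, and all shapes and names here are made up for the example. The point is that a decoder query attends only over its k highest-scoring encoder keys rather than all of them.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    d, n_keys, k = 64, 10_000, 32

    # Toy stand-ins for encoder key/value states of a very long input.
    keys = rng.standard_normal((n_keys, d)).astype(np.float32)
    values = rng.standard_normal((n_keys, d)).astype(np.float32)
    query = rng.standard_normal(d).astype(np.float32)  # one decoder query

    # Exhaustive search stands in for the kNN index: the retrieval metric is
    # the dot product, which is exactly the attention score.
    scores = keys @ query
    topk = np.argpartition(scores, -k)[-k:]  # indices of the k largest scores

    # Softmax-attend over only the k retrieved keys instead of all n_keys.
    s = scores[topk]
    w = np.exp(s - s.max())
    w /= w.sum()
    out = w @ values[topk]  # approximate cross-attention output, shape (d,)
    ```

    Because softmax weights decay quickly, the top-k keys carry nearly all of the attention mass, which is why retrieving them from an index approximates full attention at a fraction of the cost.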

    Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation

    Many recent advances in natural language generation have been fueled by training large language models on internet-scale data. However, this paradigm can lead to models that generate toxic, inaccurate, and unhelpful content, and automatic evaluation metrics often fail to identify these behaviors. As models become more capable, human feedback is an invaluable signal for evaluating and improving models. This survey aims to provide an overview of the recent research that has leveraged human feedback to improve natural language generation. First, we introduce an encompassing formalization of feedback, and identify and organize existing research into a taxonomy following this formalization. Next, we discuss how feedback can be described by its format and objective, and cover the two approaches proposed to use feedback (either for training or decoding): directly using the feedback or training feedback models. We also discuss existing datasets for human-feedback data collection, and concerns surrounding feedback collection. Finally, we provide an overview of the nascent field of AI feedback, which exploits large language models to make judgments based on a set of principles and minimize the need for human intervention. Comment: Work in Progress.

    LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs

    LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities. However, current efforts focus mainly on simple atomic tasks. We explore whether LLMs can replicate more complex crowdsourcing pipelines. We find that modern LLMs can simulate some of crowdworkers' abilities in these "human computation algorithms," but the level of success is variable and influenced by requesters' understanding of LLM capabilities, the specific skills required for sub-tasks, and the optimal interaction modality for performing these sub-tasks. We reflect on humans' and LLMs' different sensitivities to instructions, stress the importance of enabling human-facing safeguards for LLMs, and discuss the potential of training humans and LLMs with complementary skill sets. Crucially, we show that replicating crowdsourcing pipelines offers a valuable platform to investigate (1) the relative strengths of LLMs on different tasks (by cross-comparing their performance on sub-tasks) and (2) LLMs' potential in complex tasks, where they can complete part of a task while leaving the rest to humans.

    Toxicogenomic responses of Caenorhabditis elegans to pristine and transformed zinc oxide nanoparticles

    Manufactured nanoparticles (MNPs) undergo transformation immediately after they enter wastewater treatment streams and during their partitioning to sewage sludge, which is applied to agricultural soils in the form of biosolids. We examined toxicogenomic responses of the model nematode Caenorhabditis elegans to pristine and transformed ZnO-MNPs (phosphatized pZnO- and sulfidized sZnO-MNPs). To account for the toxicity due to dissolved Zn, a ZnSO4 treatment was included. Transformation of ZnO-MNPs reduced their toxicity by nearly ten-fold, while there was almost no difference in the toxicity of pristine ZnO-MNPs and ZnSO4. This, combined with the fact that far more dissolved Zn was released from ZnO-MNPs than from pZnO- or sZnO-MNPs, suggests that dissolution of pristine ZnO-MNPs is one of the main drivers of their toxicity. Transcriptomic responses at the EC30 for reproduction comprised a total of 1161 differentially expressed genes. Fifty percent of the genes differentially expressed in the ZnSO4 treatment, including the three metal-responsive genes (mtl-1, mtl-2, and numr-1), were shared among all treatments, suggesting that responses to all forms of Zn can be partially attributed to dissolved Zn. However, the toxicity and transcriptomic responses in the MNP treatments cannot be fully explained by dissolved Zn. Two of the biological pathways identified, one essential for protein biosynthesis (aminoacyl-tRNA biosynthesis) and another associated with detoxification (ABC transporters), were shared between the pristine and one or both transformed ZnO-MNPs, but not ZnSO4. When comparing pristine and transformed ZnO-MNPs, 66% and 40% of genes were shared between ZnO-MNPs and sZnO-MNPs or pZnO-MNPs, respectively. This suggests greater similarity in transcriptomic responses between ZnO-MNPs and sZnO-MNPs, while the toxicity mechanisms of pZnO-MNPs are more distinct: 13 unique biological pathways were identified for them. Based on these pathways, the toxicity of pZnO-MNPs is likely associated with their adverse effects on digestion and metabolism.

    A multiple myeloma classification system that associates normal B-cell subset phenotypes with prognosis.

    Despite the recent progress in treatment of multiple myeloma (MM), it is still an incurable malignant disease, and we are therefore in need of new risk-stratification tools that can help us to understand the disease and optimize therapy. Here we propose a new subtyping of myeloma plasma cells (PCs) from diagnostic samples, assigned by normal B-cell subset associated gene signatures (BAGS). For this purpose, we combined fluorescence-activated cell sorting and gene expression profiles from normal bone marrow (BM) Pre-BI, Pre-BII, immature, naïve, memory, and PC subsets to generate BAGS for assignment of normal BM subtypes in diagnostic samples. The impact of the subtypes was analyzed in 8 available data sets from 1772 patients' myeloma PC samples. The resulting tumor assignments in available clinical data sets exhibited similar BAGS subtype frequencies in 4 cohorts from de novo MM patients across 1296 individual cases. The BAGS subtypes were significantly associated with progression-free and overall survival in a meta-analysis of 916 patients from 3 prospective clinical trials. The major impact was observed within the Pre-BII and memory subtypes, which had a significantly inferior prognosis compared with other subtypes. A multiple Cox proportional hazards analysis documented that BAGS subtypes added significant, independent prognostic information to the translocation and cyclin D classification. BAGS subtype analysis of patient cases identified transcriptional differences, including a number of differentially spliced genes. We identified subtype differences in myeloma at diagnosis, with prognostic impact and predictive potential, supporting an acquired B-cell trait and phenotypic plasticity as a pathogenetic hallmark of MM.